Multi-Stage Feature Extraction and Classification for Ship-Radiated Noise

Due to the complexity and unique features of the hydroacoustic channel, ship-radiated noise (SRN) detected using a passive sonar tends mostly to distort. SRN feature extraction has been proposed to improve the detected passive sonar signal. Unfortunately, the current methods used in SRN feature extraction have many shortcomings. Considering this, in this paper we propose a new multi-stage feature extraction approach to enhance the current SRN feature extractions based on enhanced variational mode decomposition (EVMD), weighted permutation entropy (WPE), local tangent space alignment (LTSA), and particle swarm optimization-based support vector machine (PSO-SVM). In the proposed method, first, we enhance the decomposition operation of the conventional VMD by decomposing the SRN signal into a finite group of intrinsic mode functions (IMFs) and then calculate the WPE of each IMF. Then, the high-dimensional features obtained are reduced to two-dimensional ones by using the LTSA method. Finally, the feature vectors are fed into the PSO-SVM multi-class classifier to realize the classification of different types of SRN sample. The simulation and experimental results demonstrate that the recognition rate of the proposed method overcomes the conventional SRN feature extraction methods, and it has a recognition rate of up to 96.6667%.


Introduction
Ships are playing an increasingly important role in many military and civilian applications. For example, in military field applications, an effective prediction for enemy ships helps us to take the correct action and activate our countermeasure to avoid enemy attacks and defeat them. For civilian applications, a logical comprehensive analysis of different port noise, including ship-radiated noise (SRN) can help researchers support the reproduction of marine life [1]. For improving the passive sonar operation in ship applications, SNR feature extraction has been proposed [1]. However, marine environmental diversity provides a rich noise environment, and that increases the difficulty of extracting features reflecting the intrinsic characteristics of the ships [1]. In recent years, studies of the SRN feature extraction have increased. Unfortunately, the current SRN feature extraction schemes have many drawbacks. For example, Fourier transform (FT) [1] is only useful in only estimating the signal spectral information, but it is unsuccessful at time-varying representation. To address this drawback, short-time Fourier transform (STFT) has been proposed to indicate the time-varying signal traits. However, a fixed STFT window width makes STFT unable to consider good representation for the time domain and frequency domain at the same time [1]. To overcome the STFT drawbacks wavelet transform (WT) ate measure decreases algorithm to minimize the increases, and in that way support the confession execution. Up to now, researchers have suggested different dimension relief algorithms, such as principal component analysis (PCA) [23,24], independent component analysis (ICA) [24], and linear discriminant analysis (LDA) [25]. PCA recognizes measurement saving by realizing optimal variance without dropping the creative data. The PCA and ICA are linear unsupervised dimension reduction algorithms. LDA is also a linear projection method that achieves dimension reduction by making the most of the ratio of the discrete matrix among classes and the discrete matrix within the class. Due to the SRN non-linear characteristics, linear dimension reduction algorithms are scarce in removing the intrinsic features of SRN signals. As a non-linear manifold learning algorithm, local tangent space alignment (LTSA) has been extensively useful in dimension reduction, thanks to its fast process speed and selfishness to selected parameters [26][27][28][29]. To the best of our knowledge, there is no study combining EVMD, WPE, and LTSA to classify underwater acoustic targets. Each method has its pros and cons, motivating us to combine all methods to have the maximum benefits.
To that end, this paper puts forward a novel multistage feature extraction method proposing an EVMD method and combining it with the WPE, and LTSA for SRN samples classification. In this paper, the proposed EVMD method uses the variance of the IMFs' center frequency to calculate the mode number of VMD and enhance its operation. Next, the new EVMD algorithm is used to decompose the SRN signals plus calculate the WPE of each IMF. Then high-dimensional features are reduced to two-dimensional ones by the LTSA method. Finally, the feature vectors obtained are fed into the PSO-SVM multi-class classifier to recognize the different types of SRN samples.
The structure of the paper is presented as follows: the fundamental theories of the relevant algorithms are described in Section 2. In Section 3, the basic steps of the proposed method are presented. Section 4 applies the proposed method to the analysis of simulated signals. In Section 5, the proposed method is utilized for the feature extraction of SRN. Finally, the paper is concluded in Section 6.

Basic Theory
In this section, the theories of the related methods such as VMD, PE, WPE, and LTSA will be presented.

Variational Mode Decomposition (VMD)
The VMD defines the IMF in the function of the instantaneous amplitude Am k (t) and phase Ph k (t) as an amplitude-modulated-frequency-modulated (AM-FM) signal, given as below: I k (t) = Am k (t) cos(Ph k (t)) (1) where the change of the Am k (t) and .
Ph k (t) are slower than Ph k (t). Each I k (t) is compacted around a respective center frequency with limited bandwidth obtained by Gaussian smoothing demodulation. In the VMD algorithm, decomposing the raw signal s(t) into a finite group of IMFs to find the variational problem can be expressed as follows: where ∂ t , δ(t) and f k represent the partial derivative, impulse function, and center frequency of I k (t), respectively. The constrained variation problem in Equation (2) is addressed using the quadratic penalty term and the Lagrange multipliers below: where α and λ denote the penalty factor and Lagrange multiplier, respectively. I n+1 k , f n+1 k and λ n+1 are updated as follows: where ε represents the update parameter. In this method, the stop condition is given by: where a denotes the convergence accuracy. The VMD algorithm can be summarized in: (1) initialize Î 1 k , f 1 k ,λ 1 and n = 0; (2) update the values of Î n+1 k , f n+1 k andλ n+1 based on Equations (4)-(6); (3) check the covariance condition based on Equation (7), and the details about the VMD algorithm are published in [5].

Permutation Entropy (PE)
PE [15] can not only characterize the randomness of the time series but also detect its dynamic changes. In addition, PE does not consider the amplitude value, but only compares the neighboring values, which makes its operation speed faster. For the given time series x = x j N j=1 , the PE can be reconstructed as: where m is the embedding dimension, τ is the time delay and i = 1, 2, · · · , N − (m − 1)τ. The elements in X i can be rearranged in increasing order as: If two of the rearranged elements are equal, then, hence, the new order can be denoted as: Therefore, the symbols group can be obtained as: where S(g) represents one of m! symbol sequences in phase space, g = 1, 2, · · · , k, and k ≤ m!. If the probability distribution of the symbol sequence is P 1 , P 2 · · · , P k , for convergence, the normalized PE is defined as follows: From Equation (13), we can observe that the value of PE ranges from 0 to 1. H p indicates the randomness of time sequence, a larger H p value means higher complexity of the time series; a smaller H p value means lower uncertainty of the time series.

Weighted Permutation Entropy (WPE)
In the PE the neighboring vectors having the same ordinal patterns but with different amplitude values are unreasonably ignored. The WPE [20] has been proposed to take such a situation into account and overcome the PE shortcomings. In the WPE, forgiven embedding dimension m and time delay τ, first, the weight w i of neighbouring vectors X i is calculated as: Then, the weighted relative frequency is calculated as: Finally, the WPE definition is described below:

Local Tangent Space Alignment (LTSA)
Due to the advantages of insensitivity to parameter selection and fast operation, the local tangent space alignment (LTSA) method has been widely used in dimension reduction in multiple fields [29,30]. The basic idea of the LTSA algorithm is constructing the local tangent space by using the sample neighborhood and mapping the coordinates of the local tangent space corresponding to the global low-dimensional coordinates through the local radiological transformation matrix. Given the data X = {x 1 , x 1 , · · · , x m } ⊂ R M×N , the principle of LTSA can be briefly described as below: (1) Determine the K nearest neighbors of x i to form the set of X i , and centralizeX i , where x i = 1 k ∑ k j=1 x ij and l k is a unit vector with dimension K. (2) Calculate the eigenvalues and eigenvectors of a matrixX i by singular value decomposition. The eigenvectors corresponding to the first d largest singular values are the tangent space H i .
(3) Construct the transformation matrix L i = θ + i , to retain as much information as possible, and the following conditions must be met, where θ + i represents the generalized inverse matrix of θ i , Y i represents the set of nearest neighbors of Y after dimension reduction, that is, Y i = (y i1 , y i2 , · · · , y ik ). (4) Solve the optimization problem of Equation (21) by calculating the eigenvalues and eigenvectors of the matrix, and then the embedding matrix Y can be obtained. Equation (21) can be equivalent to the following equation: The low-dimensional embedding matrix Y can be obtained by calculating the eigenvectors corresponding to the second to d-th smallest eigenvalues of the alignment matrix B and it can be calculated as:

Proposed Feature Extraction Method Based on Enhanced Variational Mode Decomposition (EVMD), Weighted Permutation Entropy (WPE), and Local Tangent Space Alignment (LTSA)
In Section 2 the details of the basic theories of VMD, WPE, and LTSA are presented. However, the model number of VMD needs to be determined in advance, so this paper proposes the EVMD for SRN signal processing. A multi-stage feature extraction method fully inheriting the advantages of EVMD, WPE, and LTSA is proposed in this paper. The flowchart of the proposed method is shown in Figure 1. The main steps of the proposed method can be summarized as follows: (1) Based on the center frequency of the intrinsic mode functions, the variational mode decomposition mode number K will be calculated firstly. (2) The VMD algorithm based on the optimum mode number K obtained in the first step is applied to the decomposition of the SRN signals. (3) The WPE of each IMF by EVMD is then calculated. (4) The feature vectors obtained will be randomly divided into two groups, the first group represent the training data used in classification information while the second one is used for evaluating the testing data and measuring its ability for classification. (5) Definitively, PSO-SVM multiclass classifier uses the analysis data that are recognized and detects the underwater acoustic object.
To extract features of SRN, this paper classifies three types of SRN using a combination of VMD, WPE, LTSA, and PSO-SVM multi-class classifier. In the proposed method, the VMD mode number range is first set according to the EEMD decomposition results, and the variance of the IMFs' center frequency after each decomposition is calculated. The mode number corresponding to the maximum variance is used as an optimum value for VMD. Then, VMD is performed on three SRN signal types. The WPE value of each IMF for each VMD decomposition is calculated. The high-dimensional features obtained are reduced to be two-dimensional features using LTSA. Finally, the obtained feature vectors are input to the PSO-SVM multi-class classifier to achieve classification and recognition of the samples.

Analysis of Simulated Signals Based on EVMD
In this paper, as the SRN signals contain the rich line spectral components, the simulated signals composed of the single-frequency components are introduced. In addition, the simulated signals are feasible as long as they satisfy the measurement signal conditions explained in [12], namely, their frequency interval should not be too small (like 1 Hz, 2 Hz, and 3 Hz). Following the same restriction conditions of [12] and verifying the effectiveness and feasibility of the proposed EVMD method. The simulated signals used in this paper are as follows: where the data length is set to be 5000 with a sampling frequency of 1 kHz and denotes the Gaussian white noise with (0,0.5).
Following the literature [5], before performing the VMD algorithm, the model number needs to be adjusted, which serves as a main factor affecting the VMD performance. Other parameters are set as constants, namely the balancing parameter of the data-fidelity

Analysis of Simulated Signals Based on EVMD
In this paper, as the SRN signals contain the rich line spectral components, the simulated signals composed of the single-frequency components are introduced. In addition, the simulated signals are feasible as long as they satisfy the measurement signal conditions explained in [12], namely, their frequency interval should not be too small (like 1 Hz, 2 Hz, and 3 Hz). Following the same restriction conditions of [12] and verifying the effectiveness and feasibility of the proposed EVMD method. The simulated signals used in this paper are as follows: where the data length is set to be 5000 with a sampling frequency of 1 kHz and η denotes the Gaussian white noise with CN (0, 0.5). Following the literature [5], before performing the VMD algorithm, the model number K needs to be adjusted, which serves as a main factor affecting the VMD performance. Other parameters are set as constants, namely the balancing parameter of the data-fidelity constraint is α = 2000, the convergence tolerance level is tol = 1 × 10 −7 and the update mode of the center frequency is init = 0, 1, 2 for center frequency iterated with 0, uniform distribution, or randomly. Too-large a K will lead to the occurrence of over-decomposition, which means undesirable spurious components will be generated during the decomposition process; too-small a K will cause under-decomposition, which discards some IMFs carrying useful information during the decomposition process. Hence, a properly chosen K value is crucial to the VMD method. Although the VMD method in [13,14] can successfully decompose the SRN signals to some extent, the method for determining the mode number of VMD has not been reasonably demonstrated, making it unacceptable scientifically. In general, the mode number of VMD will not be greater than the mode number of EMD and EEMD, so conventional EMD and EEMD methods are first employed to analyze the simulated signals above. Figure 2 shows the decomposition results and the time domain waveforms of the simulated signals. constraint is = 2000, the convergence tolerance level is tol = 1 × 10 −7 and the update mode of the center frequency is = 0,1,2 for center frequency iterated with 0, uniform distribution, or randomly. Too-large a will lead to the occurrence of over-decomposition, which means undesirable spurious components will be generated during the decomposition process; too-small a will cause under-decomposition, which discards some IMFs carrying useful information during the decomposition process. Hence, a properly chosen value is crucial to the VMD method. Although the VMD method in [13,14] can successfully decompose the SRN signals to some extent, the method for determining the mode number of VMD has not been reasonably demonstrated, making it unacceptable scientifically. In general, the mode number of VMD will not be greater than the mode number of EMD and EEMD, so conventional EMD and EEMD methods are first employed to analyze the simulated signals above. Figure 2 shows the decomposition results and the time domain waveforms of the simulated signals.  2, show modeled indicator decomposed by using the EMD, one remaining part is obtained in addition to 10 IMFs. While in the EEMD decomposed method there are one remaining part and 10 IMFs, and they are separated. Hence, based on setting the variables for the mode number range to be ( 2 ∼ 12), and following calculation of every decomposition IMFs' center frequency, the good mode number K has the maximum variance which maximizes the IMFs center frequency difference. Figure 3, shows the IMFs' center frequency at different mode number K when the VMD decomposing method is used, and Figure 3 shows that K = 9 is the best choice.  Figure 2, show modeled indicator decomposed by using the EMD, one remainin part is obtained in addition to 10 IMFs. While in the EEMD decomposed method there ar one remaining part and 10 IMFs, and they are separated. Hence, based on setting the var iables for the mode number range to be (2~12), and following calculation of every de composition IMFs' center frequency, the good mode number has the maximum var ance which maximizes the IMFs center frequency difference. Figure 3, shows the IMFs center frequency at different mode number when the VMD decomposing method i used, and Figure 3 shows that = 9 is the best choice. When the mode number is set to be more than the good, estimated value = 9 the variance begins to reduce and that shows irrelevant variation among the IMFs' cente frequency and the existence of the done decomposition. Figure 4, shows the decompos tion results of the modeled signals based on the EVMD procedure. The IMFs' center fre quency distribution at different mode numbers are listed in Table 1. Table 1 shows how components are unseparated for = (2~6), and for 7 the modeled signal is sepa rated. For = (10~12) further false components are created to recognize the split of th modeled signals.  When the mode number K is set to be more than the good, estimated value K = 9, the variance begins to reduce and that shows irrelevant variation among the IMFs' center frequency and the existence of the done decomposition. Figure 4, shows the decomposition results of the modeled signals based on the EVMD procedure. The IMFs' center frequency distribution at different mode numbers K are listed in Table 1. Table 1 shows how components are unseparated for K = ( 2 ∼ 6), and for K ≥ 7 the modeled signal is separated. For K = ( 10 ∼ 12) further false components are created to recognize the split of the modeled signals.  Based on the analysis above, the proposed EVMD method using the variance of the IMFs' center frequency is feasible in calculating the mode number of VMD. To further verify this method in the VMD algorithm, the VMD method in [4,20] is also introduced for comparison. The correlation coefficients between the corresponding IMF and simulated signals are calculated and the results are listed in Table 2. As shown in Table 2, the corresponding three components of the proposed EVMD have the highest correlation co-  Based on the analysis above, the proposed EVMD method using the variance of the IMFs' center frequency is feasible in calculating the mode number of VMD. To further verify this method in the VMD algorithm, the VMD method in [4,20] is also introduced for comparison. The correlation coefficients between the corresponding IMF and simulated signals are calculated and the results are listed in Table 2. As shown in Table 2, the corresponding three components of the proposed EVMD have the highest correlation coefficients with the simulated signals, and its decomposition performance is significantly better than the conventional methods of the EMD, EEMD, and VMD. This confirms the validity and feasibility of the method proposed in this paper for determining the VMD mode number. According to the basic principles of WPE and PE explained in Section 2, when the signal is mutated, it is difficult for PE to detect this state, while WPE should be more sensitive to this mutation. To validate the conjecture, we generate a standard Gaussian white noise series with a length of 5000. As the pulse series means a larger fluctuation, we add the pulse series to these Gaussian white noise series. As in [31], the time delay and the embedding dimension are set as 1 and 6, respectively. The PE and WPE are calculated using a window function with a length of 500 and a sliding step of 50. The time-domain waveforms of the Gaussian white noise series and the signal plus additive pulse series are shown in Figure 5. The results of the calculated entropy are shown in Figure 6.
As shown in Figure 6, the WPE values of two signals of the SRN are smaller than the corresponding PE ones. This indicates how Gaussian white noise has a higher complexity and contains more information. There is no difference in the PE of the two signals in the pulse region, which means the inability of PE to distinguish between these two signals. This can be attributed to the neglect of the amplitude difference between neighboring vectors having the same ordinal patterns in PE calculation. As a result, PE performs poorly in effectively detecting fluctuations caused by noise. In contrast, the WPE value of the signal after superimposed pulses decreases significantly in the pulse region, indicating that the WPE can effectively detect the amplitude-encoded information contained in the signal due to the introduction of weights, which outperforms the PE in noise detection and fluctuation observation.  As shown in Figure 6, the WPE values of two signals of the SRN are smaller than the corresponding PE ones. This indicates how Gaussian white noise has a higher complexity and contains more information. There is no difference in the PE of the two signals in the pulse region, which means the inability of PE to distinguish between these two signals. This can be attributed to the neglect of the amplitude difference between neighboring vectors having the same ordinal patterns in PE calculation. As a result, PE performs poorly in effectively detecting fluctuations caused by noise. In contrast, the WPE value of the signal after superimposed pulses decreases significantly in the pulse region, indicating that the WPE can effectively detect the amplitude-encoded information contained in the signal due to the introduction of weights, which outperforms the PE in noise detection and fluctuation observation.  As shown in Figure 6, the WPE values of two signals of the SRN are smaller than the corresponding PE ones. This indicates how Gaussian white noise has a higher complexity and contains more information. There is no difference in the PE of the two signals in the pulse region, which means the inability of PE to distinguish between these two signals. This can be attributed to the neglect of the amplitude difference between neighboring vectors having the same ordinal patterns in PE calculation. As a result, PE performs poorly in effectively detecting fluctuations caused by noise. In contrast, the WPE value of the signal after superimposed pulses decreases significantly in the pulse region, indicating that the WPE can effectively detect the amplitude-encoded information contained in the signal due to the introduction of weights, which outperforms the PE in noise detection and fluctuation observation. For further comparison between the performance of the PE and WPE, 50 1/f noise samples with a length of 500 are generated and the pulse series is superimposed on the raw 1/f noise series. For a fair comparison, PE and WPE calculations concern the two signals. The time-domain waveforms of the analyzed signals and the scatter plots of the calculation results are given in Figures 7 and 8, respectively.
For further comparison between the performance of the PE and WPE, 50 1/f noise samples with a length of 500 are generated and the pulse series is superimposed on the raw 1/f noise series. For a fair comparison, PE and WPE calculations concern the two signals. The time-domain waveforms of the analyzed signals and the scatter plots of the calculation results are given in Figures 7 and 8, respectively.  As shown in the results, thanks to the good anti-noise ability of WPE, making the estimated WPE value of the two signals is lower than the PE value. Also, sudden change detection is a hard task in the PE method as it neglects the amplitude information. In contrast, except for the significant difference in WPE values between the two signals, the WPE For further comparison between the performance of the PE and WPE, 50 1/f noise samples with a length of 500 are generated and the pulse series is superimposed on the raw 1/f noise series. For a fair comparison, PE and WPE calculations concern the two signals. The time-domain waveforms of the analyzed signals and the scatter plots of the calculation results are given in Figures 7 and 8, respectively.  As shown in the results, thanks to the good anti-noise ability of WPE, making the estimated WPE value of the two signals is lower than the PE value. Also, sudden change detection is a hard task in the PE method as it neglects the amplitude information. In contrast, except for the significant difference in WPE values between the two signals, the WPE As shown in the results, thanks to the good anti-noise ability of WPE, making the estimated WPE value of the two signals is lower than the PE value. Also, sudden change detection is a hard task in the PE method as it neglects the amplitude information. In contrast, except for the significant difference in WPE values between the two signals, the WPE fluctuation trends are more dramatic than PE. Larger fluctuation means stronger discrimination ability. In short, the analysis of experimental results proves the advantages of WPE over PE.

Parameter Selection
The WPE calculation is dependent on the time delay τ and embedding dimension m. If m is too small, the reconstructed sequence will contain less state information, and the WPE algorithm cannot adequately detect the dynamic change of the time series. However, larger m indicates that the time series is homogenized by the reconstructed phase space and cannot detect the time series slight change. Also, if τ is too small, a strong correlation will occur between different delay vector elements, resulting in information redundancy. The phase space trajectory cannot be fully expanded when τ is too large. Therefore, such parameters should be adjusted first before WPE calculation.
To study the influence of m and τ on PE and WPE, the three types of SRN are randomly selected from the data set used in [32]. The samples were recorded on the Atlantic coast in north-western Spain 42 • 14 N, 008 • 43.4 W at a depth of 10 m. The sampling frequency is 52.734 kHz and the data length is set to be 5000. The three types of SRN signals are named class A, B, and C, respectively. Figure 9 shows the time-domain waveforms of the normalized signals.
over PE.

Parameter Selection
The WPE calculation is dependent on the time delay and embedding dimension . If is too small, the reconstructed sequence will contain less state information, and the WPE algorithm cannot adequately detect the dynamic change of the time series. However, larger indicates that the time series is homogenized by the reconstructed phase space and cannot detect the time series slight change. Also, if is too small, a strong correlation will occur between different delay vector elements, resulting in information redundancy. The phase space trajectory cannot be fully expanded when is too large. Therefore, such parameters should be adjusted first before WPE calculation.
To study the influence of and on PE and WPE, the three types of SRN are randomly selected from the data set used in [32]. The samples were recorded on the Atlantic coast in north-western Spain (42° 14 N, 008° 43.4 W) at a depth of 10 m. The sampling frequency is 52.734 kHz and the data length is set to be 5000. The three types of SRN signals are named class , , and , respectively. Figure 9 shows the time-domain waveforms of the normalized signals. As in [30], the PE and WPE of the three types of SRN signals are calculated under the condition m = 3, 4, 5, 6, 7, and the time delay ranges from 1 to 20. The results are shown in Figure 10. As seen in Figure 10, the class C SRN signal has the maximum entropy value and thus the highest complexity. As in [30], the PE and WPE of the three types of SRN signals are calculated under the condition m = 3, 4, 5, 6, 7, and the time delay ranges from 1 to 20. The results are shown in Figure 10. As seen in Figure 10, the class C SRN signal has the maximum entropy value and thus the highest complexity. Thanks to the excellent anti-noise performance of WPE, the WPE value of the same time series is less than the PE. WPE is more sensitive to the time delay compared to the PE, as the PE fluctuates subtly with the time delay increasing. The WPE and PE begin to separate when m = 6. However, when is increased to 7, the computational complexity will be increased without improving the accuracy of calculation results. Based on the results obtained in Figure 10, a significant difference occurs between the WPE and PE when the time delay is equal to 1. Therefore, in the proposed method when calculating the WPE and PE, the delay and embedding dimensions are set to be 1 and 6, respectively. These parameters are consistent with the recommendations given by [15,31]. The PE and WPE of the signals versus the time delay are shown in Figure 11. As shown in Figure 11, the time delay has a greater influence on PE and WPE in some ranges, while less in others as the embedding dimension increases. Compared to PE, the trends of WPE fluctuate relatively sharply, as the WPE is more sensitive to the pattern extracted from signals containing amplitude information. Thanks to the excellent anti-noise performance of WPE, the WPE value of the same time series is less than the PE. WPE is more sensitive to the time delay compared to the PE, as the PE fluctuates subtly with the time delay increasing. The WPE and PE begin to separate when m = 6. However, when m is increased to 7, the computational complexity will be increased without improving the accuracy of calculation results. Based on the results obtained in Figure 10, a significant difference occurs between the WPE and PE when the time delay is equal to 1. Therefore, in the proposed method when calculating the WPE and PE, the delay and embedding dimensions are set to be 1 and 6, respectively. These parameters are consistent with the recommendations given by [15,31]. The PE and WPE of the signals versus the time delay are shown in Figure 11. As shown in Figure 11, the time delay has a greater influence on PE and WPE in some ranges, while less in others as the embedding dimension increases. Compared to PE, the trends of WPE fluctuate relatively sharply, as the WPE is more sensitive to the pattern extracted from signals containing amplitude information.

Decomposition of Ship-Radiated Noise Using VMD
As described in section III, to calculate the VMD model number, the EEMD algor is first employed to decompose the SRN signals. The decomposition results are prese in Figure 12. Figure 12 shows that 12 IMFs and one residual component are obtained the EEMD of each type of SRN signal. For the sake of observation, K is set as (2~15 calculate the variance of the IMFs' center frequency. The results are given in Figure  can be observed from Figure 13 that the optimal K is 12, and when K is higher than 12 variance starts to decrease sharply, implying the occurrence of over-decomposition decomposition results by EVMD are given in Figure 14.

Decomposition of Ship-Radiated Noise Using VMD
As described in Section 3, to calculate the VMD model number, the EEMD algorithm is first employed to decompose the SRN signals. The decomposition results are presented in Figure 12. Figure 12 shows that 12 IMFs and one residual component are obtained after the EEMD of each type of SRN signal. For the sake of observation, K is set as ( 2 ∼ 15) to calculate the variance of the IMFs' center frequency. The results are given in Figure 13. It can be observed from Figure 13 that the optimal K is 12, and when K is higher than 12, the variance starts to decrease sharply, implying the occurrence of over-decomposition. The decomposition results by EVMD are given in Figure 14.

Classification of Ship-Radiated Noise (SRN)
In this section, the proposed method is applied to the SRN samples classification. 100 samples are randomly selected from each type of SRN sample and thus a total of 300 samples can be obtained. The EVMD is first performed to decompose these samples and the WPE of each IMF is calculated. Figure 15 shows the WPE mean and standard deviation. Figure 15 shows, when the mode is IMF3 or IMF4, that the three types of samples can be distinguished, and the WPE values of class C samples are higher than that of the other two classes. This can be due to the dynamic behavior changes at different signals. Hence, WPE can effectively reflect the dynamic changes of the time series. However, when the mode is IMF2, the class A and B samples cannot be identified; when IMF5, only class A can be identified. In other modes, all three types of sample are indistinguishable. Therefore, not every IMF can fully characterize the raw SRN signal after the proposed EVMD decomposition. The results can be attributed to two points. Firstly, due to the pollution of marine environmental noise, some IMFs belong to noise or noise-dominant components. Secondly, the occurrence of over-decomposition in the VMD algorithm allows different IMFs to share the same spectrum information. In this way, the dimension reduction algorithms are introduced to avoid dimensional disasters, thus preparing for the classification below. Next, the PCA, MDS, LLE, and LTSA methods are utilized for the low-dimensional feature extraction. The number of neighboring points and the target dimension are set as 15 and 2, respectively. The results are given in Figure   Figure 14. The EVMD results for SRN signals.

Classification of Ship-Radiated Noise (SRN)
In this section, the proposed method is applied to the SRN samples classification. 100 samples are randomly selected from each type of SRN sample and thus a total of 300 samples can be obtained. The EVMD is first performed to decompose these samples and the WPE of each IMF is calculated. Figure 15 shows the WPE mean and standard deviation.

Classification of Ship-Radiated Noise (SRN)
In this section, the proposed method is applied to the SRN samples classification. 100 samples are randomly selected from each type of SRN sample and thus a total of 300 samples can be obtained. The EVMD is first performed to decompose these samples and the WPE of each IMF is calculated. Figure 15 shows the WPE mean and standard deviation. Figure 15 shows, when the mode is IMF3 or IMF4, that the three types of samples can be distinguished, and the WPE values of class C samples are higher than that of the other two classes. This can be due to the dynamic behavior changes at different signals. Hence, WPE can effectively reflect the dynamic changes of the time series. However, when the mode is IMF2, the class A and B samples cannot be identified; when IMF5, only class A can be identified. In other modes, all three types of sample are indistinguishable. Therefore, not every IMF can fully characterize the raw SRN signal after the proposed EVMD decomposition. The results can be attributed to two points. Firstly, due to the pollution of marine environmental noise, some IMFs belong to noise or noise-dominant components. Secondly, the occurrence of over-decomposition in the VMD algorithm allows different IMFs to share the same spectrum information. In this way, the dimension reduction algorithms are introduced to avoid dimensional disasters, thus preparing for the classification below. Next, the PCA, MDS, LLE, and LTSA methods are utilized for the low-dimensional feature extraction. The number of neighboring points and the target dimension are set as 15 and 2, respectively. The results are given in Figure   Figure 15. The mean and standard deviation of WPE. Figure 15 shows, when the mode is IMF3 or IMF4, that the three types of samples can be distinguished, and the WPE values of class C samples are higher than that of the other two classes. This can be due to the dynamic behavior changes at different signals. Hence, WPE can effectively reflect the dynamic changes of the time series. However, when the mode is IMF2, the class A and B samples cannot be identified; when IMF5, only class A can be identified. In other modes, all three types of sample are indistinguishable. Therefore, not every IMF can fully characterize the raw SRN signal after the proposed EVMD decomposition. The results can be attributed to two points.
Firstly, due to the pollution of marine environmental noise, some IMFs belong to noise or noise-dominant components. Secondly, the occurrence of over-decomposition in the VMD algorithm allows different IMFs to share the same spectrum information. In this way, the dimension reduction algorithms are introduced to avoid dimensional disasters, thus preparing for the classification below. Next, the PCA, MDS, LLE, and LTSA methods are utilized for the low-dimensional feature extraction. The number of neighboring points and the target dimension are set as 15 and 2, respectively. The results are given in Figure 16. As shown in Figure 16, overall, most of the three types of samples can be distinguished by the four algorithms despite the partially overlapping samples between classes A and B. From a qualitative point of view, the combined EVMD-WPE-PCA algorithm has shown the worst performance with a small number of samples crossing between class A and B, and more samples overlapping between class B and C, while the performance of EVMD-WPE-MDS and EVMD-WPE-LLE has been significantly improved compared to EVMD-WPE-PCA with only a few samples of class A and B crossed. Despite the good performance of EVMD-WPE-PCA, EVMD-WPE-MDS, and EVMD-WPE-LLE in roughly identifying the three types of samples, the degree of clustering within the class is still small, especially for class B. In contrast, the proposed combined EVMD-WPE-LTSA method has the best clustering performance. The three kinds of SRN sample can be separated well, and the samples of classes B and C are better clustered. For a fair comparison, the scatter plots of WPE, EMD-WPE-LTSA and EEMD-WPE-LTSA are represented in Figure 17.
As shown in Figure 17a, affected by the oceanic environment and the ambient noise, we cannot recognize the majority of the three forms of samples as they will be opposed together. Hence in such a case, the algorithms of the decomposition should be used. For the EMD-WPE-LTSA shown in Figure 16b, a large proportion of samples of classes A and B are overlapped, which cannot be separated well. The samples in the C category are also intersected with the others. Hence, recognition of the three samples becomes difficult. While concerning the EEMD-WPE-LTSA, the samples in categories A and B can be completely separated, but the degree of separation between categories B and C is still small. From a qualitative point of view, there is no significant difference between EEMD-WPE-LTSA and EVMD-WPE-LTSA, and EEMD-WPE-LTSA even looks more clustered within the class, but we cannot conclude on the merits of both algorithms. Data visualization is only a qualitative tool rather than a quantitative one. In this situation, the PSO-SVM multi-class classifier is introduced to compare the algorithms above more accurately. Next, 60 samples are randomly selected from each class to be used as a training classifier and the remaining samples are left to be used in testing the performance. Also, both EMD-EIMF-PE [21] and VMD-WPE-LTSA (the mode number of VMD is set as [4,20]) methods are considered. The classification outputs are shown in Figure 18. Table 3 lists the classification accuracy and computational time for SRN feature extraction under different algorithms.  16. As shown in Figure 16, overall, most of the three types of samples can be distinguished by the four algorithms despite the partially overlapping samples between classes A and B. From a qualitative point of view, the combined EVMD-WPE-PCA algorithm has shown the worst performance with a small number of samples crossing between class A and B and more samples overlapping between class B and C, while the performance of EVMD WPE-MDS and EVMD-WPE-LLE has been significantly improved compared to EVMD WPE-PCA with only a few samples of class A and B crossed. Despite the good perfor mance of EVMD-WPE-PCA, EVMD-WPE-MDS, and EVMD-WPE-LLE in roughly identi fying the three types of samples, the degree of clustering within the class is still small especially for class B. In contrast, the proposed combined EVMD-WPE-LTSA method ha the best clustering performance. The three kinds of SRN sample can be separated well and the samples of classes B and C are better clustered. For a fair comparison, the scatte plots of WPE, EMD-WPE-LTSA and EEMD-WPE-LTSA are represented in Figure 17. As shown in Figure 17a, affected by the oceanic environment and the ambient noise we cannot recognize the majority of the three forms of samples as they will be opposed together. Hence in such a case, the algorithms of the decomposition should be used. Fo the EMD-WPE-LTSA shown in Figure 16b, a large proportion of samples of classes A and B are overlapped, which cannot be separated well. The samples in the C category are also intersected with the others. Hence, recognition of the three samples becomes difficult While concerning the EEMD-WPE-LTSA, the samples in categories A and B can be com pletely separated, but the degree of separation between categories B and C is still small From a qualitative point of view, there is no significant difference between EEMD-WPE LTSA and EVMD-WPE-LTSA, and EEMD-WPE-LTSA even looks more clustered within the class, but we cannot conclude on the merits of both algorithms. Data visualization i only a qualitative tool rather than a quantitative one. In this situation, the PSO-SVM multi class classifier is introduced to compare the algorithms above more accurately. Next, 60 samples are randomly selected from each class to be used as a training classifier and the remaining samples are left to be used in testing the performance. Also, both EMD-EIMF PE [21] and VMD-WPE-LTSA (the mode number of VMD is set as [4,20]) methods are considered. The classification outputs are shown in Figure18. Table 3 lists the classification accuracy and computational time for SRN feature extraction under different algorithms.   As shown in Figure 18 and Table 3, the direct calculation of the WPE for the samples fails to achieve the SRN samples identification as it has only a 62.5% recognition rate and that is far from the classification standard. The multistage classification techniques based on the proposed EVMD have the highest classification accuracy of 95.8333% and 96.6667% in EVMD-WPE-LLE and EVMD-WPE-LTSA, respectively. The proposed multistage classification techniques based on EVMD outperform the other conventional algorithms with the additional computational time cost. In general, computational complexity and classification accuracy are determined by the configuration of the hardware and algorithm design. that is far from the classification standard. The multistage classification techniques based on the proposed EVMD have the highest classification accuracy of 95.8333% and 96.6667% in EVMD-WPE-LLE and EVMD-WPE-LTSA, respectively. The proposed multistage classification techniques based on EVMD outperform the other conventional algorithms with the additional computational time cost. In general, computational complexity and classification accuracy are determined by the configuration of the hardware and algorithm design.

Conclusions
A novel multi-stage feature extraction method for underwater acoustic signals is proposed in this paper based on combining the new EVMD method with WPE and LTSA. The main innovations and contributions of this work can be summarized as follows:

Conclusions
A novel multi-stage feature extraction method for underwater acoustic signals is proposed in this paper based on combining the new EVMD method with WPE and LTSA. The main innovations and contributions of this work can be summarized as follows: (1) The proposed EVMD method mode number was calculated based on the variance of IMFs' center frequency. (2) WPE and LTSA were combined with the new EVMD and applied to underwater SRN signals for the first time. (3) The proposed EVMD-WPE-LTSA was successfully applied, and extracted the features of the underwater acoustic signals. Compared with other algorithms, EVMD-WPE-LTSA had a higher recognition rate and better classification performance at the cost of computational time.
The EVMD algorithm proposed in this paper to overcome the shortage of the VMD performed accurately in the field of underwater acoustic communication. Nevertheless, this paper only addresses the VMD mode number. In future, the optimization between both the model number and the quadratic penalty term will be considered to obtain high decomposition accuracy for the EVMD. Also, we will try to reduce the computation complexity.